127 research outputs found

    Joint segmentation of many aCGH profiles using fast group LARS

    Array-Based Comparative Genomic Hybridization (aCGH) is a method used to search for genomic regions with copy number variations. For a given aCGH profile, one challenge is to accurately segment it into regions of constant copy number. Subjects sharing the same disease status, for example a type of cancer, often have aCGH profiles with similar copy number variations, due to duplications and deletions relevant to that particular disease. We introduce a constrained optimization algorithm that jointly segments aCGH profiles of many subjects. It simultaneously penalizes the amount of freedom the set of profiles has to jump from one level of constant copy number to another, at genomic locations known as breakpoints. We show that breakpoints shared by many different profiles tend to be found first by the algorithm, even in the presence of significant amounts of noise. The algorithm can be formulated as a group LARS problem. We propose an extremely fast way to find the solution path, i.e., a sequence of shared breakpoints in order of importance. At no extra cost, the algorithm smooths all of the aCGH profiles into piecewise-constant regions of equal copy number, giving low-dimensional versions of the original data. These can be shown for all profiles on a single graph, allowing for intuitive visual interpretation. Simulations and an implementation of the algorithm on bladder cancer aCGH profiles are provided.
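
    The idea of a breakpoint shared across profiles, followed by piecewise-constant smoothing, can be illustrated with a minimal sketch. This is not the group LARS procedure of the paper: it assumes a simple weighted cumulative-sum score summed over profiles as a stand-in, and the function names `first_shared_breakpoint` and `piecewise_constant` are hypothetical.

```python
import numpy as np

def first_shared_breakpoint(Y):
    """Return the position (1..n-1) where a jump shared by all profiles is best supported.

    Y has shape (n_positions, n_profiles). The score is a weighted CUSUM
    statistic summed over profiles (an assumption made for this sketch,
    not the authors' group LARS criterion).
    """
    n = Y.shape[0]
    centered = Y - Y.mean(axis=0)
    # Row k-1 holds, for each profile, the sum of its first k centered values.
    csum = np.cumsum(centered, axis=0)[:-1]
    k = np.arange(1, n)
    weights = n / (k * (n - k))
    score = weights * np.sum(csum ** 2, axis=1)
    return int(np.argmax(score)) + 1

def piecewise_constant(Y, breakpoints):
    """Replace each profile by its mean on every segment between breakpoints."""
    edges = [0, *sorted(breakpoints), Y.shape[0]]
    smoothed = np.empty_like(Y, dtype=float)
    for start, stop in zip(edges[:-1], edges[1:]):
        smoothed[start:stop] = Y[start:stop].mean(axis=0)
    return smoothed

# Three noise-free toy profiles that all jump at position 60.
Y = np.zeros((100, 3))
Y[60:] = [1.0, 2.0, 0.5]
bp = first_shared_breakpoint(Y)
smoothed = piecewise_constant(Y, [bp])
```

    Summing the score over profiles is what lets weak jumps at a common location reinforce each other, which mirrors the abstract's claim that shared breakpoints tend to be found first.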

    The group fused Lasso for multiple change-point detection

    We present the group fused Lasso for detection of multiple change-points shared by a set of co-occurring one-dimensional signals. Change-points are detected by approximating the original signals with a constraint on the multidimensional total variation, leading to piecewise-constant approximations. Fast algorithms are proposed to solve the resulting optimization problems, either exactly or approximately. Conditions are given for consistency of both algorithms as the number of signals increases, and empirical evidence is provided to support the results on simulated and array comparative genomic hybridization data.
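
    As a rough illustration of the quantity being constrained, the sketch below computes a multidimensional total variation, taken here to be the sum of Euclidean norms of successive increments across all signals. The unweighted form and the name `multidim_total_variation` are assumptions made for this example; the paper may use position-dependent weights.

```python
import numpy as np

def multidim_total_variation(U):
    """Sum over positions of the Euclidean norm of the jump between consecutive rows.

    U has shape (n_positions, n_signals). A row-to-row difference is zero
    wherever all signals stay constant, so only shared or individual jumps
    contribute to the total.
    """
    increments = np.diff(U, axis=0)
    return float(np.sum(np.linalg.norm(increments, axis=1)))

# Two signals, constant except for one simultaneous jump of size (3, 4).
U = np.zeros((10, 2))
U[5:] = [3.0, 4.0]
tv = multidim_total_variation(U)
```

    Because each increment is penalized through its vector norm rather than coordinate by coordinate, a jump that occurs in many signals at the same position costs less than the same jumps spread over different positions, which favours shared change-points.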

    Maturity Mismatch and Financial Crises: Evidence from Emerging Market Corporations

    Substantial attention has been paid in recent years to the risk of maturity mismatch in emerging markets. Although this risk is microeconomic in nature, the evidence advanced thus far has taken the form of macro correlations. This paper empirically evaluates this mechanism at the micro level by using a database of over 3,000 publicly traded firms from fifteen emerging markets. The paper measures the risk of short-term exposure by estimating, at the firm level, the effect on investment of the interaction of short-term exposure and aggregate capital flows. This effect is (statistically) zero, contrary to the prediction of the maturity-mismatch hypothesis. This conclusion is robust to using a variety of different estimators, alternative measures of capital flows, and controls for devaluation effects and access to international capital. The paper finds evidence that short-term-exposed firms pay higher financing costs and liquidate assets at fire sale prices, but the paper does not find that this reduction in net worth translates into a drop in investment.

    Corporate Dollar Debt and Depreciations: Much Ado About Nothing?

    Much has been written recently about the problems for emerging markets that might result from a mismatch between foreign-currency denominated liabilities and assets (or income flows) denominated in local currency. In particular, several models, developed in the aftermath of financial crises of the late 1990s, suggest that the expansion in the "peso" value of "dollar" liabilities resulting from a devaluation could, via a net worth effect, offset the expansionary competitiveness effect. Assessing which effect dominates is ultimately an empirical matter. In this vein, this paper constructs a new database with accounting information (including the currency composition of liabilities) for over 450 non-financial firms in five Latin American countries. The authors estimate, at the firm level, the reduced-form effect on investment of holding foreign-currency-denominated debt during an exchange-rate realignment. It is consistently found that, contrary to the predicted sign of the net-worth effect, firms holding more dollar debt do not invest less than their counterparts in the aftermath of a depreciation. The paper shows that this result is due to firms matching the currency denomination of their liabilities with the exchange-rate sensitivity of their profits. Because of this matching, the negative balance-sheet effects of a depreciation on firms holding dollar debt are offset by the larger competitiveness gains of these firms.

    Long signal change-point detection

    The detection of change-points in a spatially or time-ordered data sequence is an important problem in many fields such as genetics and finance. We derive the asymptotic distribution of a statistic recently suggested for detecting change-points. Simulation of its estimated limit distribution leads to a new and computationally efficient change-point detection algorithm, which can be used on very long signals. We assess the algorithm via simulations and on previously benchmarked real-world data sets.

    The Statistical Performance of Collaborative Inference

    The statistical analysis of massive and complex data sets will require the development of algorithms that depend on distributed computing and collaborative inference. Inspired by this, we propose a collaborative framework that aims to estimate the unknown mean θ of a random variable X. In the model we present, a certain number of calculation units, distributed across a communication network represented by a graph, participate in the estimation of θ by sequentially receiving independent data from X while exchanging messages via a stochastic matrix A defined over the graph. We give precise conditions on the matrix A under which the statistical precision of the individual units is comparable to that of a (gold standard) virtual centralized estimate, even though each unit does not have access to all of the data. We show in particular the fundamental role played by both the non-trivial eigenvalues of A and the Ramanujan class of expander graphs, which provide remarkable performance for moderate algorithmic cost.
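
    A minimal sketch of this kind of scheme follows, under stated assumptions: each unit folds its newest observation into a running mean and then mixes its estimate with its neighbours through A. The exact update rule, the ring-shaped network, and the names `collaborative_means` and `ring_matrix` are illustrative choices, not taken from the paper.

```python
import numpy as np

def collaborative_means(X, A):
    """Run one pass of local averaging followed by message exchange.

    X has shape (T, N): X[t, i] is the observation received by unit i at step t.
    A is an (N, N) stochastic matrix defined over the communication graph.
    Returns the N per-unit estimates of the mean after T steps.
    """
    T, N = X.shape
    est = np.zeros(N)
    for t in range(1, T + 1):
        # Local step: running-mean update with the unit's own new observation.
        local = ((t - 1) / t) * est + X[t - 1] / t
        # Communication step: each unit averages its neighbours' values.
        est = A @ local
    return est

def ring_matrix(N):
    """Doubly stochastic matrix for a ring: half weight on self, a quarter on each neighbour."""
    A = np.zeros((N, N))
    for i in range(N):
        A[i, i] = 0.5
        A[i, (i - 1) % N] += 0.25
        A[i, (i + 1) % N] += 0.25
    return A

rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, scale=1.0, size=(200, 8))
ring_estimates = collaborative_means(X, ring_matrix(8))
full_estimates = collaborative_means(X, np.full((8, 8), 1 / 8))
```

    With a doubly stochastic A, the average of the units' estimates always equals the centralized sample mean, and with the complete-graph averaging matrix every unit matches it exactly. How close individual units get on sparser graphs depends on how quickly A mixes, which is where its non-trivial eigenvalues enter.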

    Progress and open challenges in extremely high-dimensional medical outcome prediction

    Using biological data for medical decisions requires "extremely high" prediction accuracy; mistakes can lead to death.

    Between-Subject and Within-Subject Model Mixtures for Classifying HIV Treatment Response

    We present a method for using longitudinal data to classify individuals into clinically-relevant population subgroups. This is achieved by treating "subgroup" as a categorical covariate whose value is unknown for each individual, and predicting its value using mixtures of models that represent "typical" longitudinal data from each subgroup. Under a nonlinear mixed effects model framework, two types of model mixtures are presented, both of which have their advantages. Following illustrative simulations, longitudinal viral load data for HIV-positive patients is used to predict whether they are responding (completely, partially or not at all) to a new drug treatment.

    Automatic data binning for improved visual diagnosis of pharmacometric models

    Visual Predictive Checks (VPC) are graphical tools to help decide whether a given model could have plausibly generated a given set of real data. Typically, time-course data is binned into time intervals, then statistics are calculated on the real data and data simulated from the model, and represented graphically for each interval. Poor selection of bins can easily lead to incorrect model diagnosis. We propose an automatic binning strategy that improves reliability of model diagnosis using VPC. It is implemented in version 4 of the Monolix software.
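
    The binning step that a VPC relies on can be sketched as follows. This example uses plain equal-count (quantile) bins, which is an assumption made for illustration and not the automatic strategy proposed in the paper; `binned_percentiles` is a hypothetical helper name.

```python
import numpy as np

def binned_percentiles(times, values, n_bins=5, percentiles=(10, 50, 90)):
    """Split observations into equal-count time bins and summarize each bin.

    Returns (edges, stats): edges has n_bins + 1 entries, and stats[b, j]
    is the percentiles[j]-th percentile of the values falling in bin b
    (NaN if the bin is empty).
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    edges = np.quantile(times, np.linspace(0.0, 1.0, n_bins + 1))
    # Assign each time to the bin whose left edge it has reached;
    # the clip puts the maximum time into the last bin.
    idx = np.clip(np.searchsorted(edges, times, side="right") - 1, 0, n_bins - 1)
    stats = np.full((n_bins, len(percentiles)), np.nan)
    for b in range(n_bins):
        in_bin = values[idx == b]
        if in_bin.size:
            stats[b] = np.percentile(in_bin, percentiles)
    return edges, stats

times = np.arange(100.0)
edges, stats = binned_percentiles(times, values=times, n_bins=4, percentiles=(50,))
```

    Running the same function on observed and on model-simulated data gives the per-interval statistics that a VPC plots against each other. Changing `n_bins` changes those summaries, which is why the choice of bins affects the diagnosis.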